AITopics | word and letter

An End-to-End Approach for Child Reading Assessment in the Xhosa Language

Chevtchenko, Sergio, Navas, Nikhil, Vale, Rafaella, Ubaudi, Franco, Lucwaba, Sipumelele, Ardington, Cally, Afshar, Soheil, Antoniou, Mark, Afshar, Saeed

arXiv.org Artificial Intelligence, Jun-3-2025

Child literacy is a strong predictor of outcomes at later stages of an individual's life. This points to a need for targeted interventions in vulnerable low- and middle-income populations to help bridge the gap between literacy levels in these regions and high-income ones. In this effort, reading assessments provide an important tool to measure the effectiveness of these programs, and AI can be a reliable and economical tool to support educators with this task. Developing accurate automatic reading assessment systems for child speech in low-resource languages poses significant challenges due to limited data and the unique acoustic properties of children's voices. This study focuses on Xhosa, a language spoken in South Africa, to advance child speech recognition capabilities. We present a novel dataset composed of child speech samples in Xhosa. The dataset is available upon request and contains ten words and letters, which are part of the Early Grade Reading Assessment (EGRA) system. Each recording is labeled by multiple markers using an online, cost-effective approach, and a subsample is validated by an independent EGRA reviewer. The dataset is evaluated with three fine-tuned state-of-the-art end-to-end models: wav2vec 2.0, HuBERT, and Whisper. The results indicate that the performance of these models can be significantly influenced by the amount and balance of the available training data, a fundamental consideration for cost-effective collection of large datasets. Furthermore, our experiments indicate that wav2vec 2.0 performance improves when training on multiple classes at a time, even when the number of available samples is constrained.

artificial intelligence, machine learning, natural language, (16 more...)


arXiv:2505.17371

Country: Africa > South Africa (0.67)

Genre: Research Report (1.00)

Industry: Education > Educational Setting (0.93)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)


Fusion of Word and Letter Based Metrics for Automatic MT Evaluation

Yang, Muyun (Harbin Institute of Technology) | Zhu, Junguo (Harbin Institute of Technology) | Li, Sheng (Harbin Institute of Technology) | Zhao, Tiejun (Harbin Institute of Technology)

AAAI Conferences, Aug-3-2013

As machine translation improves, it becomes harder to develop evaluation metrics that capture differences among systems relative to human translations. In contrast to current efforts that leverage more linguistic information to characterize translation quality, this paper combines language-independent features to obtain a robust MT evaluation metric. To compete with the finer modeling granularity that linguistic features provide, the proposed method augments word-level metrics with a letter-based calculation. An empirical study is then conducted on WMT data, training the metrics with ranking SVM. The results show that integrating current language-independent metrics yields good performance across a variety of languages. Time-split data validation is promising as a better training setting, though the greedy strategy also works well.
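
To make the idea of pairing a word-level score with a letter-based one concrete, here is a minimal sketch in Python. It is not the paper's method: the n-gram F1 overlap, the function names, the trigram size, and the fixed `weight` are all illustrative assumptions. In particular, the abstract says the combination is learned with ranking SVM, whereas this sketch simply averages the two scores.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token sequence (empty if it is shorter than n)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_f1(hyp_tokens, ref_tokens, n):
    """F1 of clipped n-gram overlap between a hypothesis and a reference."""
    hyp, ref = ngrams(hyp_tokens, n), ngrams(ref_tokens, n)
    overlap = sum((hyp & ref).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def word_score(hyp, ref, n=1):
    """Word-level overlap: tokens are whitespace-separated words."""
    return ngram_f1(hyp.split(), ref.split(), n)


def letter_score(hyp, ref, n=3):
    """Letter-level overlap: tokens are characters with spaces removed."""
    return ngram_f1(list(hyp.replace(" ", "")), list(ref.replace(" ", "")), n)


def fused_score(hyp, ref, weight=0.5):
    """Fixed-weight average (an assumption; the paper learns the fusion)."""
    return weight * word_score(hyp, ref) + (1 - weight) * letter_score(hyp, ref)


if __name__ == "__main__":
    hyp, ref = "the cats sat", "the cat sat"
    print(word_score(hyp, ref), letter_score(hyp, ref), fused_score(hyp, ref))
```

In this toy pair the word-level score penalizes "cats" as a complete miss against "cat", while the letter-level score still credits the shared character trigrams. That is the kind of finer granularity a letter-based calculation can add without language-specific resources.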

automatic mt evaluation, fusion, word and letter, (1 more...)


Twenty-Third International Joint Conference on Artificial Intelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.87)
